Inverted Bilingual Topic Models for Lexicon Extraction from Non-parallel Data

Abstract

Topic models have been successfully applied in lexicon extraction. However, most previous methods are limited to document-aligned data. In this paper, we try to address two challenges of applying topic models to lexicon extraction in non-parallel data: 1) hard to model the word relationship and 2) noisy seed dictionary. To solve these two challenges, we propose two new bilingual topic models to better capture the semantic information of each word while discriminating the multiple translations in a noisy seed dictionary. We extend the scope of topic models by inverting the roles of "word" and "document". In addition, to solve the problem of noise in seed dictionary, we incorporate the probability of translation selection in our models. Moreover, we also propose an effective measure to evaluate the similarity of words in different languages and select the optimal translation pairs. Experimental results using real world data demonstrate the utility and efficacy of the proposed models.
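
As a rough illustration of what inverting the roles of "word" and "document" could look like in practice, the sketch below builds one pseudo-document per word from the words that appear near it. This is only one possible reading of the abstract, not the authors' actual construction: the function name `invert_corpus`, the context-window approach, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def invert_corpus(sentences, window=2):
    """Build a pseudo-document for every word (hypothetical helper).

    Each word becomes a "document" whose contents are the words seen
    within `window` positions of it. The context-window choice is an
    assumption; the abstract does not specify how the inversion is done.
    """
    pseudo_docs = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # skip the word itself
                    pseudo_docs[word][tokens[j]] += 1
    return dict(pseudo_docs)


# Toy monolingual corpus (made up for this example).
corpus = [
    ["the", "bank", "approved", "the", "loan"],
    ["the", "river", "bank", "flooded"],
]
docs = invert_corpus(corpus, window=1)
print(docs["bank"])
```

Under this assumption, a standard topic model could then be fit on the pseudo-documents, so that each word (rather than each original document) receives a topic distribution that can be compared across languages.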

Bibliographic details

Authors: Ma, Tengfei; Nasukawa, Tetsuya
Year: 2017
Original format: PDF

Similar literature

1. Multilingual Topic Models for Bilingual Dictionary Extraction [J]. Xiaodong Liu, Kevin Duh, Yuji Matsumoto. ACM Transactions on Asian Language Information Processing, 2015, No. 3.
2. Online Knowledge-Based Model for Big Data Topic Extraction [J]. Khan Muhammad Taimoor, Durrani Mehr, Khalid Shehzad. Computational Intelligence and Neuroscience, 2016, No. Pta2.
3. Use of a Latent Topic Model for Characteristic Extraction from Health Checkup Questionnaire Data [J]. Hatakeyama Y., Miyano I., Kataoka H. Methods of Information in Medicine, 2015, No. 6.
4. Inverted Bilingual Topic Models for Lexicon Extraction from Non-parallel Data [C]. Tengfei Ma, Tetsuya Nasukawa. International Joint Conference on Artificial Intelligence, 2019.
5. Semantic ambiguity in the lexical access of verbs: How data from monolinguals and bilinguals inform a general model of the mental lexicon [D]. Swanson, Amy Phyllis. 2010.
6. Online Knowledge-Based Model for Big Data Topic Extraction [O]. Muhammad Taimoor Khan, Mehr Durrani, Shehzad Khalid. 2016.
7. Bilingual word embeddings from non-parallel document-aligned data applied to bilingual lexicon induction [O]. Vulic Ivan, Moens Marie-Francine. 2015.